" Input:  {
            dict                          - format prepared for predictive analytics
                                            {
                                              ("dict")                - add to meta of the entry (useful for subview_uoa, for example)
                                              ("meta")                - coarse-grain meta information to distinguish entries (species)
                                              ("tags")                - tags (separated by comma)
                                              ("subtags")             - subtags to write to a point
                                              ("dependencies")        - (resolved) dependencies
                                              ("choices")             - choices (for example, optimizations)
                                              ("features")            - species features in points inside entries (mostly unchanged)
                                                                        (may contain state, such as frequency or cache/bus contention, etc.)
                                              "characteristics"       - (dict) species characteristics in points inside entries (measured)
                                                  or
                                              "characteristics_list"  - (list) adding multiple experiments at the same time

                                              Note: at the end, we only keep characteristics_list and append characteristics to this list ...

                                              Note: if a string starts with @@, it should be of format "@@float_value1,float_value2,..."
                                                    and will be converted into a list of values which will be statistically processed
                                                    as one dimension in time (needed to deal properly with benchmarks like slambench,
                                                    which report kernel times for all frames)

                                              (pipeline_state)        - final state of the pipeline
                                                                        {
                                                                          'repetitions':
                                                                          'fail_reason'
                                                                          'fail'
                                                                          'fail_bool'
                                                                        }

                                              (choices_desc)          - choices description
                                              (features_desc)         - features description
                                              (characteristics_desc)  - characteristics description

                                              (pipeline)              - (dict) if experiment comes from a pipeline, record it
                                                                        to be able to reproduce/replay it
                                              (pipeline_uoa)          - if experiment comes from a CK pipeline (from some repo), record UOA
                                              (pipeline_uid)          - if experiment comes from a CK pipeline (from some repo), record UID
                                                                        (to be able to reproduce experiments, test other choices
                                                                         and improve the pipeline by the community/workgroups)

                                              (dict_to_compare)       - flat dict to calculate improvements
                                            }

            (experiment_repo_uoa)         - if defined, use it instead of repo_uoa (useful for remote repositories)
            (remote_repo_uoa)             - if remote access, use this as a remote repo UOA

            (experiment_uoa)              - if entry with aggregated experiments is already known
            (experiment_uid)              - if entry with aggregated experiments is already known

            (force_new_entry)             - if 'yes', do not search for an existing entry, but add a new one!

            (search_point_by_features)    - if 'yes', find subpoint by features
            (features_keys_to_process)    - list of keys for features (and choices) to process/search (can be wildcards);
                                            by default ['##features#*', '##choices#*', '##choices_order#*']

            (ignore_update)               - if 'yes', do not record update control info (date, user, etc.)

            (sort_keys)                   - if 'yes', sort keys in output json

            (skip_flatten)                - if 'yes', skip flattening and analyzing data (including stat analysis) ...
            (skip_stat_analysis)          - if 'yes', just flatten array and add #min

            (process_multi_keys)          - list of key prefixes to perform stat analysis on the flat array;
                                            by default ['##characteristics#*', '##features#*', '##choices#*'];
                                            if empty, no stat analysis

            (record_all_subpoints)        - if 'yes', record all subpoints (i.e. do not search and reuse existing points by features)

            (max_range_percent_threshold) - (float) if set, record all subpoints where max_range_percent exceeds this threshold;
                                            useful to avoid recording too many similar points and keep only *unusual* ones ...

            (record_desc_at_each_point)   - if 'yes', record descriptions for each point and not just for the entry;
                                            useful if descriptions change at each point (say, when checking all compilers
                                            for 1 benchmark in one entry - then compiler flags will be changing)

            (record_deps_at_each_point)   - if 'yes', record dependencies for each point and not just for the entry;
                                            useful if dependencies change at each point (say, different programs
                                            may require different libs)

            (record_permanent)            - if 'yes', mark as permanent (to avoid being deleted by the Pareto filter)

            (skip_record_pipeline)        - if 'yes', do not record pipeline (to avoid saving too much stuff during crowd-tuning)
            (skip_record_desc)            - if 'yes', do not record desc (to avoid saving too much stuff during crowd-tuning)
          }

  Output: {
            return         - return code = 0, if successful
                                         > 0, if error
            (error)        - error text if return > 0

            update_dict    - dict after updating entry
            dict_flat      - flat dict with stat analysis (if performed)
            stat_analysis  - whole output of stat analysis (with warnings)

            flat_features  - flat dict of real features of the recorded point
                             (can later be used to search for the same points)

            recorded_uid   - UID of the recorded experiment
            point          - recorded point
            sub_point      - recorded subpoint

            elapsed_time   - elapsed time (useful for debugging - to speed up processing of "big data" ;) )
          } "
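The "@@float_value1,float_value2,..." convention described in the Input section can be illustrated with a small, self-contained sketch. The helper name `convert_iterative_value` is hypothetical and not part of the CK API; it only demonstrates the conversion of such strings into lists of floats for per-dimension statistical processing:

```python
def convert_iterative_value(value):
    """If a string starts with '@@', interpret the rest as a
    comma-separated list of floats and return that list;
    otherwise return the value unchanged.

    Hypothetical helper illustrating the '@@' convention only,
    not the actual CK implementation.
    """
    if isinstance(value, str) and value.startswith('@@'):
        return [float(x) for x in value[2:].split(',') if x != '']
    return value

# Example: per-frame kernel times, as reported by benchmarks like slambench
times = convert_iterative_value('@@0.031,0.029,0.030')
```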
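The flat keys used by features_keys_to_process and process_multi_keys (e.g. '##features#*') refer to a nested dict flattened into '##key#subkey' paths. The following is a simplified sketch of that idea using wildcard matching, not CK's internal flattening routine:

```python
import fnmatch

def flatten(d, prefix='#'):
    """Flatten a nested dict into flat keys such as
    '##features#cpu_freq' (simplified sketch, not CK's real routine)."""
    flat = {}
    for k, v in d.items():
        key = prefix + '#' + str(k)
        if isinstance(v, dict):
            flat.update(flatten(v, key))
        else:
            flat[key] = v
    return flat

def select_keys(flat, patterns):
    """Keep only flat keys matching any wildcard pattern, e.g. the
    default ['##features#*', '##choices#*', '##choices_order#*']."""
    return {k: v for k, v in flat.items()
            if any(fnmatch.fnmatch(k, p) for p in patterns)}

point = {'features': {'cpu_freq': 2000},
         'choices': {'opt': '-O3'},
         'characteristics': {'time': 1.2}}
flat = flatten(point)
search_keys = select_keys(flat, ['##features#*', '##choices#*'])
```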
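The stat analysis over repetitions (and the max_range_percent_threshold check) can be sketched roughly as below. The exact formula CK uses for max_range_percent is not shown in this doc; the range-relative-to-mean definition here is an assumption for illustration:

```python
def summarize(values, threshold=None):
    """Compute simple per-dimension stats for a list of repetitions
    and flag whether the spread exceeds a threshold (in percent).
    The max_range_percent formula is an assumption, not CK's exact one."""
    mn, mx = min(values), max(values)
    mean = sum(values) / len(values)
    max_range_percent = 0.0 if mean == 0 else (mx - mn) / mean * 100.0
    stats = {'#min': mn, '#max': mx, '#mean': mean,
             '#max_range_percent': max_range_percent}
    # An 'unusual' point would be recorded even without record_all_subpoints
    stats['#unusual'] = (threshold is not None
                         and max_range_percent > threshold)
    return stats
```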
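The Output section follows CK's usual return convention: an integer 'return' key (0 on success) plus an 'error' string when it is non-zero. A caller-side sketch, where `add_experiment` is a hypothetical stand-in for invoking this action (e.g. via ck.access with 'action':'add' and 'module_uoa':'experiment'):

```python
def add_experiment(i):
    """Hypothetical stand-in for this module's 'add' action; returns a
    CK-style dict with 'return' and, on error, 'error'. The validation
    and output values here are illustrative only."""
    if 'dict' not in i:
        return {'return': 1, 'error': 'dict is not specified'}
    return {'return': 0, 'recorded_uid': 'some-uid', 'elapsed_time': 0.0}

r = add_experiment({'dict': {'characteristics': {'time': 1.2}}})
if r['return'] > 0:
    raise RuntimeError(r['error'])
```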